I noticed that the data can be split at the year 2012.
Before it, only two labels appear; after this year, a third label becomes active.
So I decided to train two models, one for each time range.
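
A minimal sketch of the split described above, in plain Python. The record layout `(year, label, file name)`, the toy data, and the choice to put 2012 itself into the earlier range are assumptions for illustration, not the actual dataset format:

```python
from collections import defaultdict

SPLIT_YEAR = 2012  # taken from the text; the boundary handling is an assumption


def split_by_year(samples, split_year=SPLIT_YEAR):
    """Partition (year, label, item) records into an early and a late group."""
    groups = defaultdict(list)
    for year, label, item in samples:
        # Assumption: the split year itself belongs to the earlier range.
        key = "early" if year <= split_year else "late"
        groups[key].append((year, label, item))
    return groups["early"], groups["late"]


# Hypothetical toy records: (year, label, file name)
samples = [
    (2009, "a", "img_001.png"),
    (2011, "b", "img_002.png"),
    (2013, "c", "img_003.png"),
    (2015, "a", "img_004.png"),
]
early, late = split_by_year(samples)

# Each model is then trained only on the labels present in its own range.
early_labels = {label for _, label, _ in early}
late_labels = {label for _, label, _ in late}
```

With a split like this, the model for the earlier range only has to separate two classes, while the later one handles all three.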
I generated synthetic data (~6k images) using a few augmentation techniques (rotation, brightness changes, vertical shift).
This enlarged the training set with new images that resemble the real ones but still give the models new information.
It improves the models' accuracy by ~25 percentage points.
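
To make the three augmentations concrete, here is a dependency-free sketch that treats a grayscale image as a list of pixel rows. It is not the code used for the ~6k images; the parameter ranges (±15 degrees, 0.7 to 1.3 brightness, ±2 rows) and all function names are assumptions, and in practice an image library would usually do this work:

```python
import math
import random


def adjust_brightness(img, factor):
    """Scale every pixel by `factor`, clamping to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def shift_vertical(img, dy, fill=0):
    """Move the image down by `dy` rows (up if negative), padding with `fill`."""
    h, w = len(img), len(img[0])
    return [
        list(img[y - dy]) if 0 <= y - dy < h else [fill] * w
        for y in range(h)
    ]


def rotate(img, degrees, fill=0):
    """Rotate about the image centre using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel for each output pixel.
            sx = round(cos_a * (x - cx) + sin_a * (y - cy) + cx)
            sy = round(-sin_a * (x - cx) + cos_a * (y - cy) + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def augment(img, rng):
    """Apply one random rotation, brightness change, and vertical shift."""
    img = rotate(img, rng.uniform(-15, 15))           # assumed angle range
    img = adjust_brightness(img, rng.uniform(0.7, 1.3))  # assumed factor range
    return shift_vertical(img, rng.randint(-2, 2))    # assumed shift range


rng = random.Random(0)  # fixed seed so the synthetic set is reproducible
original = [[(x + y) * 10 for x in range(8)] for y in range(8)]
synthetic = [augment(original, rng) for _ in range(3)]
```

Each call to `augment` yields a slightly different variant of the same image, which is how a modest real dataset can be expanded into several thousand training samples.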
Obviously, there are still some necessary steps left; however, I can't code everything in one day :)
I see two main steps to do: